Skip to content

Form read write split#442

Draft
aolivier23 wants to merge 20 commits intoFramework-R-D:mainfrom
aolivier23:form_read_write_split
Draft

Form read write split#442
aolivier23 wants to merge 20 commits intoFramework-R-D:mainfrom
aolivier23:form_read_write_split

Conversation

@aolivier23
Copy link
Contributor

This PR splits form_interface into two components: form_reader_interface and form_writer_interface. This split saves FORM time by avoiding lock contention between read and write processes and prepares for RNTuple support that does not have members for an RNTupleWriter when only reading from a container.

All of the layers that power the former form_interface are split in this way. A handful of functions shared between reading and writing have been made free functions in form::experimental::detail.

@aolivier23
Copy link
Contributor Author

@phlexbot format

@github-actions
Copy link
Contributor

Format Fixes Applied

✅ clang-format fixes pushed (commit d586ade)
✅ cmake-format fixes pushed (commit 0e84f5f)
✅ ruff fixes pushed (commit ee7bae1)
✅ YAML formatter fixes pushed (commit 23e120b)

⚠️ Note: Some issues may require manual review.

@greenc-FNAL
Copy link
Contributor

Review the full CodeQL report for details.

@codecov
Copy link

codecov bot commented Mar 23, 2026

Codecov Report

❌ Patch coverage is 83.49515% with 68 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
form/root_storage/root_tbranch_read_container.cpp 76.78% 8 Missing and 5 partials ⚠️
form/root_storage/root_tbranch_write_container.cpp 75.47% 9 Missing and 4 partials ⚠️
form/root_storage/root_ttree_write_container.cpp 50.00% 6 Missing and 6 partials ⚠️
form/storage/storage_reader.cpp 82.60% 4 Missing and 4 partials ⚠️
form/storage/storage_writer.cpp 84.78% 4 Missing and 3 partials ⚠️
form/util/factories.hpp 77.77% 0 Missing and 4 partials ⚠️
form/persistence/persistence_reader.cpp 84.21% 2 Missing and 1 partial ⚠️
form/root_storage/root_ttree_read_container.cpp 70.00% 2 Missing and 1 partial ⚠️
form/form/form_reader.cpp 83.33% 1 Missing and 1 partial ⚠️
form/root_storage/demangle_name.cpp 71.42% 1 Missing and 1 partial ⚠️
... and 1 more
@@            Coverage Diff             @@
##             main     #442      +/-   ##
==========================================
+ Coverage   85.00%   85.33%   +0.33%     
==========================================
  Files         127      145      +18     
  Lines        3341     3431      +90     
  Branches      574      583       +9     
==========================================
+ Hits         2840     2928      +88     
+ Misses        306      303       -3     
- Partials      195      200       +5     
Flag Coverage Δ
unittests 85.33% <83.49%> (+0.33%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
form/form/form_reader.hpp 100.00% <100.00%> (ø)
form/form/form_writer.cpp 77.77% <100.00%> (ø)
form/form/form_writer.hpp 100.00% <100.00%> (ø)
form/form_module.cpp 89.36% <100.00%> (+0.23%) ⬆️
form/persistence/ipersistence_reader.hpp 100.00% <100.00%> (ø)
form/persistence/ipersistence_writer.hpp 100.00% <100.00%> (ø)
form/persistence/persistence_reader.hpp 100.00% <100.00%> (ø)
form/persistence/persistence_writer.cpp 100.00% <100.00%> (ø)
form/persistence/persistence_writer.hpp 100.00% <100.00%> (ø)
form/root_storage/root_tbranch_read_container.hpp 100.00% <100.00%> (ø)
... and 26 more

... and 3 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3bda620...c47a8c6. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

}

m_pers = form::detail::experimental::createPersistenceReader();
m_pers->configureOutputItems(output_config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Reader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're correct about this. Will review and probably change. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs a rename but can't be eliminated without rewriting more of PersistenceReader. We use m_output_items to figure out what file and technology to use when making a Token to get row ID.


form_reader_interface::form_reader_interface(config::output_item_config const& output_config,
config::tech_setting_config const& tech_config) :
m_pers(nullptr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest rename

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about output_item_config -> io_item_config? container_config? I think we're going to need it in both the reader and the writer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The location information on data to read should not depend on configuration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the (global) input file-names, a reading job does not need to tell FORM where and what objects are stored, FORM should find them. Whereas on output, configuration does determine which outputs are written where.

config::tech_setting_config const& tech_config) :
form_writer_interface::form_writer_interface(config::output_item_config const& output_config,
config::tech_setting_config const& tech_config) :
m_pers(nullptr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest rename

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will investigate. I agree that it shouldn't change. Thank you for catching this.

virtual void configureTechSettings(
form::experimental::config::tech_setting_config const& tech_config_settings) = 0;

virtual void configureOutputItems(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I expect that I can remove everything dealing with output config from the reader-flavor components. I'll report back if it doesn't work for some reason.

}
}

void ROOT_TBranch_Read_ContainerImp::setFile(std::shared_ptr<IStorage_File> file)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reading?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I think has some reasonable future applications. We might want to read multiple input files in the same job in the future like the old framework does. It might well be a no-op right now though.

Why do you think we should remove it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you have a container (on read), wouldn't it know its file?

return;
}

void ROOT_TBranch_Read_ContainerImp::setParent(std::shared_ptr<IStorage_Read_Container> parent)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure whether we need this for read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will investigate, thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that we would split associative... Do we need them on read and write?

@aolivier23
Copy link
Contributor Author

@phlexbot format

@github-actions
Copy link
Contributor

Format Fixes Applied

✅ clang-format fixes pushed (commit c47a8c6)

⚠️ Note: Some issues may require manual review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants